Why waste sorting?

Recycling contamination occurs when waste is incorrectly disposed of - like recycling a greasy pizza box (it belongs in the compost) - or when waste is correctly disposed of but incorrectly prepared, like recycling unrinsed jam jars.

Contamination is a huge problem in the recycling industry that can be mitigated with automated waste sorting. Just for kicks, I thought I'd try my hand at prototyping an image classifier to classify trash and recyclables - this classifier could have applications in an optical sorting system.

Building an image classifier

In this project, I'll use the fastai library (built on PyTorch) to train a convolutional neural network that classifies an image as cardboard, glass, metal, paper, plastic, or trash. I used an image dataset collected by hand by Gary Thung and Mindy Yang. Download their dataset here to follow along, then move it into the same directory as this notebook. (Note: you'll want to use a GPU to speed up training.)

My modeling pipeline:

  1. Download and extract the images
  2. Organize the images into different folders
  3. Train model
  4. Make and evaluate test predictions
  5. Next steps
In [1]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

%config InlineBackend.figure_format = 'retina'
In [2]:
from fastai.vision import *
from fastai.metrics import error_rate
from pathlib import Path
from glob2 import glob
from sklearn.metrics import confusion_matrix

import pandas as pd
import numpy as np
import os
import random  ## used by split_indices below
import zipfile as zf
import shutil
import re
import seaborn as sns
import matplotlib.pyplot as plt  ## used to size the heatmap later

1. Extract data

First, we need to extract the contents of "dataset-resized.zip".

In [3]:
files = zf.ZipFile("dataset-resized.zip",'r')
files.extractall()
files.close()

Once unzipped, the dataset-resized folder has six subfolders:

In [4]:
os.listdir(os.path.join(os.getcwd(),"dataset-resized"))
Out[4]:
['.DS_Store', 'cardboard', 'glass', 'metal', 'paper', 'plastic', 'trash']

2. Organize images into different folders

Now that we've extracted the data, I'm going to split the images into train, validation, and test folders with a 50-25-25 split. First, I'll define some helper functions to build the data set quickly. If you're not interested in how it's built, you can just run this section and ignore the details.

In [5]:
## helper functions ##

## splits indices for a folder into train, validation, and test indices with random sampling
    ## input: folder path
    ## output: train, valid, and test indices    
def split_indices(folder,seed1,seed2):    
    n = len(os.listdir(folder))
    full_set = list(range(1,n+1))

    ## train indices
    random.seed(seed1)
    train = random.sample(full_set,int(.5*n))

    ## temp
    remain = list(set(full_set)-set(train))

    ## separate remaining into validation and test
    random.seed(seed2)
    valid = random.sample(remain,int(.5*len(remain)))
    test = list(set(remain)-set(valid))
    
    return(train,valid,test)

## gets file names for a particular type of trash, given indices
    ## input: waste category and indices
    ## output: file names 
def get_names(waste_type,indices):
    file_names = [waste_type+str(i)+".jpg" for i in indices]
    return(file_names)    

## moves group of source files to another folder
    ## input: list of source files and destination folder
    ## no output
def move_files(source_files,destination_folder):
    for file in source_files:
        shutil.move(file,destination_folder)
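
As a quick sanity check, here's a standalone version of the split logic (operating on a folder size rather than an actual folder) confirming that the three index sets form a 50-25-25 partition:

```python
import random

## standalone version of split_indices, taking a folder size n
## instead of a folder path
def split_indices_toy(n, seed1, seed2):
    full_set = list(range(1, n + 1))
    random.seed(seed1)
    train = random.sample(full_set, int(.5 * n))
    remain = list(set(full_set) - set(train))
    random.seed(seed2)
    valid = random.sample(remain, int(.5 * len(remain)))
    test = list(set(remain) - set(valid))
    return train, valid, test

train, valid, test = split_indices_toy(100, 1, 1)
assert len(train) == 50 and len(valid) == 25 and len(test) == 25
## every index lands in exactly one subset
assert set(train) | set(valid) | set(test) == set(range(1, 101))
```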

Next, I'm going to create a bunch of destination folders according to the ImageNet directory convention. It'll look like this:

/data
     /train
         /cardboard
         /glass
         /metal
         /paper
         /plastic
         /trash
     /valid
         /cardboard
         /glass
         /metal
         /paper
         /plastic
         /trash
    /test

Each image file name is just the material name plus a number (e.g. cardboard1.jpg).

Again, this is just housekeeping to organize my files.

In [6]:
## paths will be train/cardboard, train/glass, etc...
subsets = ['train','valid']
waste_types = ['cardboard','glass','metal','paper','plastic','trash']

## create destination folders for data subset and waste type
for subset in subsets:
    for waste_type in waste_types:
        folder = os.path.join('data',subset,waste_type)
        if not os.path.exists(folder):
            os.makedirs(folder)
            
if not os.path.exists(os.path.join('data','test')):
    os.makedirs(os.path.join('data','test'))
            
## move files to destination folders for each waste type
for waste_type in waste_types:
    source_folder = os.path.join('dataset-resized',waste_type)
    train_ind, valid_ind, test_ind = split_indices(source_folder,1,1)
    
    ## move source files to train
    train_names = get_names(waste_type,train_ind)
    train_source_files = [os.path.join(source_folder,name) for name in train_names]
    train_dest = "data/train/"+waste_type
    move_files(train_source_files,train_dest)
    
    ## move source files to valid
    valid_names = get_names(waste_type,valid_ind)
    valid_source_files = [os.path.join(source_folder,name) for name in valid_names]
    valid_dest = "data/valid/"+waste_type
    move_files(valid_source_files,valid_dest)
    
    ## move source files to test
    test_names = get_names(waste_type,test_ind)
    test_source_files = [os.path.join(source_folder,name) for name in test_names]
    ## I use data/test directly because test images don't need class subfolders
    move_files(test_source_files,"data/test")

I set the seed for both random samples to be 1 for reproducibility. Now that the data's organized, we can get to model training.

In [7]:
## get a path to the folder with images
path = Path(os.getcwd())/"data"
path
Out[7]:
WindowsPath('C:/Users/13607/data')
In [8]:
tfms = get_transforms(do_flip=True,flip_vert=True)
data = ImageDataBunch.from_folder(path,test="test",ds_tfms=tfms,bs=16)

The batch size bs is how many images you'll train at a time. Choose a smaller batch size if your computer has less memory.

You can use the get_transforms() function to augment your data; here I allow images to be flipped both horizontally and vertically.

In [9]:
data
Out[9]:
ImageDataBunch;

Train: LabelList (1262 items)
x: ImageList
Image (3, 384, 512),Image (3, 384, 512),Image (3, 384, 512),Image (3, 384, 512),Image (3, 384, 512)
y: CategoryList
cardboard,cardboard,cardboard,cardboard,cardboard
Path: C:\Users\13607\data;

Valid: LabelList (630 items)
x: ImageList
Image (3, 384, 512),Image (3, 384, 512),Image (3, 384, 512),Image (3, 384, 512),Image (3, 384, 512)
y: CategoryList
cardboard,cardboard,cardboard,cardboard,cardboard
Path: C:\Users\13607\data;

Test: LabelList (635 items)
x: ImageList
Image (3, 384, 512),Image (3, 384, 512),Image (3, 384, 512),Image (3, 384, 512),Image (3, 384, 512)
y: EmptyLabelList
,,,,
Path: C:\Users\13607\data
In [10]:
print(data.classes)
['cardboard', 'glass', 'metal', 'paper', 'plastic', 'trash']

Here's an example of what the data looks like:

In [11]:
data.show_batch(rows=4,figsize=(10,8))

3. Model training

In [12]:
learn = cnn_learner(data,models.resnet34,metrics=error_rate)

What is resnet34?

A residual neural network is a convolutional neural network (CNN) with lots of layers. In particular, resnet34 is a CNN with 34 layers that's been pretrained on the ImageNet database. A pretrained CNN will perform better on new image classification tasks because it has already learned some visual features and can transfer that knowledge over (hence transfer learning).

Since they're capable of describing more complexity, deep neural networks should theoretically perform better than shallow ones on training data. In practice, though, very deep plain networks tend to perform worse than shallower ones.

Resnets were created to circumvent this problem with a trick called shortcut connections. If some nodes in a layer have suboptimal values, the weights and biases get adjusted; if a node is already optimal (its residual is zero), why not leave it alone? Adjustments are made only on an as-needed basis, when there are non-zero residuals.

When no adjustment is needed, the shortcut connection applies the identity function to pass information straight on to subsequent layers. This effectively shortens the network where possible, which lets resnets use deep architectures while behaving more like shallow networks. The 34 in resnet34 just refers to the number of layers.

Anand Saha gives a more in-depth explanation here.
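
The shortcut idea is easy to see in code. Here's a minimal sketch of a residual block in plain PyTorch (a simplified version of the BasicBlock modules printed below, not fastai's or torchvision's exact implementation):

```python
import torch
import torch.nn as nn

class MiniResidualBlock(nn.Module):
    """Simplified residual block: output = relu(f(x) + x)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        ## the shortcut connection: add the input back before the final relu,
        ## so the convolutions only need to learn the residual
        return self.relu(out + x)

block = MiniResidualBlock(64)
x = torch.randn(1, 64, 32, 32)
print(block(x).shape)  # torch.Size([1, 64, 32, 32])
```

If the convolutions learn weights near zero, the block passes x through almost unchanged, which is exactly the "leave it alone" behavior described above.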

In [13]:
learn.model
Out[13]:
Sequential(
  (0): Sequential(
    (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (4): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (5): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (downsample): Sequential(
          (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): BasicBlock(
        (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): BasicBlock(
        (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (3): BasicBlock(
        (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (6): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (downsample): Sequential(
          (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (3): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (4): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (5): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (7): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (downsample): Sequential(
          (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): BasicBlock(
        (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): BasicBlock(
        (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (1): Sequential(
    (0): AdaptiveConcatPool2d(
      (ap): AdaptiveAvgPool2d(output_size=1)
      (mp): AdaptiveMaxPool2d(output_size=1)
    )
    (1): Flatten()
    (2): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): Dropout(p=0.25, inplace=False)
    (4): Linear(in_features=1024, out_features=512, bias=True)
    (5): ReLU(inplace=True)
    (6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (7): Dropout(p=0.5, inplace=False)
    (8): Linear(in_features=512, out_features=6, bias=True)
  )
)

Finding a learning rate

I'm going to find a learning rate for gradient descent to make sure that my neural network converges reasonably quickly without missing the optimal error. For a refresher on the learning rate, check out Jeremy Jordan's post on choosing a learning rate.

In [14]:
learn.lr_find(start_lr=1e-6,end_lr=1e1)
learn.recorder.plot()
epoch train_loss valid_loss error_rate time
0 3.069236 #na# 01:45

LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.

The learning rate finder suggests a learning rate of 5.13e-03. With this, we can train the model.

Training

In [15]:
learn.fit_one_cycle(20,max_lr=5.13e-03)
epoch train_loss valid_loss error_rate time
0 1.651198 0.754808 0.271429 01:46
1 1.104332 0.519124 0.176190 01:38
2 0.923132 0.490434 0.149206 01:38
3 0.861049 0.588064 0.177778 01:40
4 0.815407 0.630139 0.193651 01:38
5 0.785332 0.694489 0.217460 01:41
6 0.782573 0.681486 0.209524 01:40
7 0.740195 0.611430 0.215873 01:40
8 0.715574 0.552514 0.165079 01:40
9 0.562522 0.399169 0.123810 01:40
10 0.542520 0.368121 0.115873 01:40
11 0.507928 0.277835 0.085714 01:40
12 0.413242 0.275061 0.101587 01:40
13 0.307766 0.284955 0.093651 01:40
14 0.300517 0.252112 0.087302 01:42
15 0.240274 0.218749 0.065079 01:41
16 0.246420 0.202174 0.061905 01:40
17 0.189184 0.178598 0.053968 01:41
18 0.192909 0.183421 0.060317 01:45
19 0.174008 0.175872 0.057143 01:40

I ran my model for 20 epochs. What's cool about this fitting method is that the learning rate rises and then falls over the cycle, allowing us to get closer and closer to the optimum. At 5.7%, the final validation error looks super good... let's see how it performs on the test data, though.
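
For intuition, here's a rough numpy sketch of the one-cycle learning-rate shape: warm up to max_lr, then anneal back down. (This is just an illustration; fastai's actual fit_one_cycle also cycles momentum and uses its own annealing functions.)

```python
import numpy as np

def one_cycle_lr(max_lr, n_iters, pct_warmup=0.3):
    """Illustrative one-cycle schedule: cosine warmup, then cosine annealing."""
    warm = int(n_iters * pct_warmup)
    up = max_lr * (1 - np.cos(np.pi * np.arange(warm) / warm)) / 2
    down = max_lr * (1 + np.cos(np.pi * np.arange(n_iters - warm) / (n_iters - warm))) / 2
    return np.concatenate([up, down])

lrs = one_cycle_lr(5.13e-3, 100)
print(lrs.max())  # peaks at max_lr = 0.00513
print(lrs[-1])    # ends near zero
```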

First, we can take a look at which images were most incorrectly classified.

Visualizing the most incorrect images

In [16]:
interp = ClassificationInterpretation.from_learner(learn)
losses,idxs = interp.top_losses()
In [17]:
interp.plot_top_losses(9, figsize=(15,11))

The images the classifier performed worst on were actually degraded. It looks like those photos were overexposed, so this isn't really a fault of the model!

In [18]:
doc(interp.plot_top_losses)
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)

The model most often mistook glass for metal and plastic for metal. The list of most confused pairs is below.

In [19]:
interp.most_confused(min_val=2)
Out[19]:
[('glass', 'metal', 6),
 ('plastic', 'metal', 5),
 ('glass', 'plastic', 4),
 ('plastic', 'glass', 3),
 ('paper', 'cardboard', 2),
 ('paper', 'trash', 2),
 ('plastic', 'paper', 2),
 ('trash', 'glass', 2),
 ('trash', 'paper', 2)]

4. Make new predictions on test data

To see how this model really performs, we need to make predictions on the test data. First, I'll make predictions on the test data using the learner.get_preds() method.

Note: learner.predict() only predicts on a single image, while learner.get_preds() predicts on a set of images. I highly recommend reading the documentation to learn more about predict() and get_preds().

In [20]:
preds = learn.get_preds(ds_type=DatasetType.Test)

The ds_type argument in get_preds() takes a DatasetType value: DatasetType.Train, DatasetType.Valid, or DatasetType.Test. I mention this because I made the mistake of passing in the actual data (learn.data.test_ds), which gave me the wrong output and took embarrassingly long to debug.

Don't make this mistake! Don't pass in data -- pass in the dataset type!

In [21]:
print(preds[0].shape)
preds[0]
torch.Size([635, 6])
Out[21]:
tensor([[9.9886e-01, 3.7543e-05, 1.4676e-06, 1.0957e-03, 4.4132e-06, 5.5016e-06],
        [9.9996e-01, 2.5232e-06, 1.1433e-06, 3.2161e-05, 6.1419e-06, 1.7053e-06],
        [9.9997e-01, 2.0341e-08, 5.4533e-07, 1.1986e-05, 1.9920e-06, 1.9813e-05],
        ...,
        [6.9210e-05, 6.4342e-06, 1.7694e-05, 1.6822e-01, 5.1728e-05, 8.3163e-01],
        [8.2820e-06, 1.5741e-07, 5.4146e-05, 3.8663e-01, 4.0825e-05, 6.1327e-01],
        [8.4705e-02, 1.9075e-03, 1.7295e-03, 2.8307e-01, 4.4888e-03, 6.2410e-01]])

These are the predicted probabilities for each image. This tensor has 635 rows -- one for each image -- and 6 columns -- one for each material category.

In [22]:
data.classes
Out[22]:
['cardboard', 'glass', 'metal', 'paper', 'plastic', 'trash']

Now I'm going to convert the probabilities in the tensor above to a string with one of the class names.

In [23]:
## saves the index (0 to 5) of most likely (max) predicted class for each image
max_idxs = np.asarray(np.argmax(preds[0],axis=1))
In [24]:
yhat = []
for max_idx in max_idxs:
    yhat.append(data.classes[max_idx])
In [25]:
yhat
Out[25]:
['cardboard',
 'cardboard',
 'cardboard',
 'cardboard',
 'cardboard',
 ...
 'plastic',
 'trash',
 'trash',
 'trash']

These are the predicted labels for all the images! Let's check whether the first image is actually cardboard.

In [26]:
learn.data.test_ds[0][0]
Out[26]:

It is!

Next, I'll get the actual labels from the test dataset.

In [27]:
y = []

## convert POSIX paths to string first
for label_path in data.test_ds.items:
    y.append(str(label_path))
    
## then extract waste type from file path
pattern = re.compile("([a-z]+)[0-9]+")
for i in range(len(y)):
    y[i] = pattern.search(y[i]).group(1)
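
For instance, on a hypothetical file path in the same style as the real test files, the pattern pulls out the material name like this:

```python
import re

pattern = re.compile("([a-z]+)[0-9]+")
example = "data/test/glass23.jpg"  ## hypothetical path for illustration
print(pattern.search(example).group(1))  # glass
```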

A quick check.

In [28]:
## predicted values
print(yhat[0:5])
## actual values
print(y[0:5])
['cardboard', 'cardboard', 'cardboard', 'cardboard', 'cardboard']
['cardboard', 'cardboard', 'cardboard', 'cardboard', 'cardboard']

It looks like the first five predictions match up!

How did we end up doing? Again we can use a confusion matrix to find out.

Test confusion matrix

In [30]:
cm = confusion_matrix(y,yhat)
print(cm)
[[ 96   0   1   4   0   0]
 [  0 115   5   1   5   0]
 [  1   6  91   0   2   3]
 [  1   0   0 146   0   2]
 [  0   3   0   1 113   4]
 [  1   0   0   3   2  29]]

Let's try and make this matrix a little prettier.

In [31]:
df_cm = pd.DataFrame(cm,waste_types,waste_types)

plt.figure(figsize=(10,8))
sns.heatmap(df_cm,annot=True,fmt="d",cmap="YlGnBu")
Out[31]:
<matplotlib.axes._subplots.AxesSubplot at 0x172cafdf808>

Again, the model most often mixed up metal and glass, in both directions. With more time, I'm sure further investigation could help reduce these mistakes.

In [32]:
correct = 0

for r in range(len(cm)):
    for c in range(len(cm)):
        if (r==c):
            correct += cm[r,c]
In [33]:
accuracy = correct/sum(sum(cm))
accuracy
Out[33]:
0.9291338582677166
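
As an aside, the diagonal-sum loop above is equivalent to taking the trace of the confusion matrix, or to sklearn's accuracy_score computed straight from the labels. A quick sketch on toy data:

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

## toy labels, just to show the equivalence
y_true = ["glass", "metal", "glass", "paper"]
y_pred = ["glass", "metal", "plastic", "paper"]

cm = confusion_matrix(y_true, y_pred)
assert np.trace(cm) / cm.sum() == accuracy_score(y_true, y_pred)
print(accuracy_score(y_true, y_pred))  # 0.75
```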

I ended up achieving an accuracy of 92.9% on the test data, which is pretty great -- the original creators of the TrashNet dataset achieved a test accuracy of 63% with a support vector machine on a 70-30 train-test split (their neural network reached only 27% test accuracy).

In [34]:
## delete everything when you're done to save space
shutil.rmtree("data")
shutil.rmtree('dataset-resized')

5. Next steps

If I had more time, I'd go back and reduce classification error for glass in particular. I'd also delete photos from the dataset that are overexposed, since those images are just bad data.

This was just a quick and dirty mini-project to show that training an image classification model doesn't take long, and it's pretty amazing how quickly you can create a state-of-the-art model using the fastai library. If you have an application you're interested in but don't think you have the machine learning chops, this should be encouraging.

Thanks to James Dellinger for this blog post about classifying bluejays. For more information about recycling, check out this FiveThirtyEight post.